Chi-Chun Pan, The Pennsylvania State University, cpan@ist.psu.edu [PRIMARY contact]
Don Pellegrino, Drexel University, don@drexel.edu
Chris Weaver, The Pennsylvania State University, cweaver@psu.edu [Faculty advisor]
Prasenjit Mitra, The Pennsylvania State University, pmitra@ist.psu.edu [Faculty advisor]
Student team: NO
Improvise is a desktop application for building and browsing a wide
range of flexible and powerful visual analysis tools. Live design of visual
queries facilitates fast and flexible interactive drill-down into fine-grain
relationships buried in spatiotemporal and social network information spread
across multiple data sets. Cross-filtering queries across multiple views
provide analysts with the means to seek out and dissect subtle patterns in
complex information spaces.
We used Improvise to build wiki.viz, an interactive visualization of the wiki
edit history data. The interface supports analysis of both the unstructured data
and the information extracted from it by automated text processing, such as
reverting networks and word senses.
Two Page Summary: NO
ANSWERS:
Wiki-1:
What are the factions represented in the edit pages and who are its members? In
other words, describe the groups and their members based on their editing
changes.
GROUP | Names
Supporters | VictoriaV, RyogaNica, Amado
Challengers | Edemir, Rm99, DailosTamanca
Neutral contributors | Agustin, Sara
Page removers | 195.113.65.x, Alejo, 75.179.21.x, 84.158.202.x, Alejandrosanchez, 209.155.27.x, Molotover, 66.175.135.x, 201.226.51.x, Honoratas, 74.120.3.x, 85.135.211.x, 67.55.3.x, Rosamaria, Absalon, 204.52.215.x, 74.130.152.x, 139.55.50.x, 69.14.85.x
People who only revert edits | Kurrop, Seina
Detailed Answer:
Click here to download the video that accompanies this answer.
In the first step, we created a set of rules to extract the revert patterns
from the summary field of each edit. We classified the revert patterns into
three types. The first type is “revert”: the author of the edit undid others’
work and returned the page to a previous version. For example, if the pattern
“Reverted 1 edit by [author2] …” appears, we obtain the triple (author, revert,
author2). The second type is “undo”. Unlike a revert, an undo reverses a single
edit without simultaneously undoing all the constructive changes made since. An
example of an “undo” pattern is “Undid revision xxxxxxxxx by [author2] …”. Both
“revert” and “undo” patterns indicate that the author of the edit disagreed
with the changes made by author2. The third type is “revertTo”. Unlike the
other two types, a “revertTo” edit might indicate that the author of the edit
thinks the changes made by author2 are better than the current version. A
“revertTo” pattern sometimes appears together with a “revert” pattern. For
example, from a summary containing the pattern “reverting possible vandalism by
[author2] to last good revision by [author3]”, we obtain two triples: (author,
revert, author2) and (author, revertTo, author3). From the extracted patterns
we build a reverting graph whose nodes are authors and whose links are the
three types of patterns extracted from the wiki history. We created a graph
view to visualize the reverting graph (see Figure 1). By selecting authors from
the lists of edit authors and revision authors, we can visually analyze the
interactions between users.
Figure 1. Graph view of the reverting network. On the right-hand side, analysts
can select authors of interest and revision types for analysis.
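The extraction rules described above can be sketched roughly as follows. This is a minimal illustration, not the rule set we actually used: the regular expressions are assumptions modeled only on the three summary examples quoted in the text, the function names `extract_triples` and `build_graph` are hypothetical, and user names are assumed to contain no spaces.

```python
import re
from collections import Counter

# Hypothetical patterns modeled on the summary examples quoted in the text.
# Each entry pairs a regex with the (relation, capture group) triples it yields.
PATTERNS = [
    (re.compile(r"reverting possible vandalism by (?P<a2>\S+) "
                r"to last good revision by (?P<a3>\S+)", re.I),
     [("revert", "a2"), ("revertTo", "a3")]),
    (re.compile(r"Reverted \d+ edits? by (?P<a2>\S+)", re.I),
     [("revert", "a2")]),
    (re.compile(r"Undid revision \d+ by (?P<a2>\S+)", re.I),
     [("undo", "a2")]),
]


def extract_triples(author, summary):
    """Return (author, relation, other_author) triples for one edit summary."""
    for regex, relations in PATTERNS:
        match = regex.search(summary)
        if match:
            # Strip brackets and trailing punctuation around the captured name.
            return [(author, rel, match.group(group).strip("[]().,"))
                    for rel, group in relations]
    return []


def build_graph(edits):
    """Count triples over (author, summary) pairs; counts act as edge weights."""
    edges = Counter()
    for author, summary in edits:
        edges.update(extract_triples(author, summary))
    return edges
```

The counts returned by `build_graph` play the role of edge widths in the graph view: the more often one author reverts another, the heavier the link between them.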
To identify groups in the wiki edit history, we started with the Discussion
page. We first tried to categorize authors into three groups: supporters,
challengers, and neutral contributors. By reading the discussion page, we
obtained a list of users who contributed to it (see Table 1) and their opinions
regarding the Paraiso Manifesto. There are 19 distinct authors on the
discussion page. We removed authors who appear only once on the discussion page
because all of them are inactive in the dataset.
Table 1. List of users who appear on the discussion page
User | Topic
Agustin | Lacks Objectivity and Neutrality
VictoriaV | Lacks Objectivity and Neutrality, Civility
Edemir | Lacks Objectivity and Neutrality, Edit warring
Rm99 | POV Pushing, Civility
81.96.243.x | Civility
Afrox | sect status
Cisco | sect status
Sierra | Edit warring
SqueakBox | Edit warring
GROUP - SUPPORTERS
The supporters are the people who agree with the point of view of the Paraiso
Manifesto. Among these users, VictoriaV is the most active. Although VictoriaV
presented herself in the discussion as neutral with respect to the Paraiso
Manifesto, she had many arguments with others on the discussion page that belie
that claim. For example, VictoriaV argued with Agustin and Edemir about the
source of Catalano’s enlightenment. In addition, Rm99 claimed that VictoriaV
has lied about her background and her relation to the Paraiso Manifesto. These
exchanges indicate that she supports Catalano’s point of view. Therefore, we
tag VictoriaV as a supporter of the Paraiso movement.
Of the 387 users, most made only one edit, which makes it difficult to judge
their opinion of the Paraiso Manifesto. Therefore, when analyzing the reverting
graph, we select only users who made more than 3 edits. With time filtering, we
can apply animation to replay the interactions between users in the reverting
graph. The types of reverting actions are revert, undo, and revertTo; we can
filter on the type of revision by selecting Revision Types in the
visualization. We start our analysis with the users found in the previous
analysis of the discussion page. If two users revert or undo each other several
times (the width of an edge indicates the number of revisions), we put them
into different groups. Conversely, if we see a revertTo link between two users,
we tag them as members of the same group. In this way we found two more users
in the supporters group, RyogaNica and Amado. Based on the reverting graph,
RyogaNica made several revisions to edits by Rm99, DailosTamanca, and Edemir
(all identified as challengers). Amado made many edits, and VictoriaV made a
revertTo revision to one of Amado’s edits. After reading Amado’s comments in
the table view of the wiki edit history (Figure 2), we suspect that Amado is a
supporter of the Paraiso movement who tried to maintain the wiki page.
Figure 2. Table view of the wiki edit history sorted by source (author).
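The grouping heuristic above was applied visually in the graph view, but its logic can be approximated in code. The sketch below is an assumption-laden illustration: the function name `propagate_labels`, the `min_conflicts` threshold standing in for "several times", and the breadth-first propagation order are all hypothetical choices, and the first label a user receives is kept even if later evidence conflicts with it.

```python
from collections import Counter, deque

OPPOSITE = {"supporter": "challenger", "challenger": "supporter"}


def propagate_labels(edges, seeds, min_conflicts=2):
    """Spread group labels from seed users across the reverting graph.

    edges: dict mapping (author, relation, other_author) -> count
    seeds: dict mapping user -> "supporter" or "challenger"
    """
    conflicts = Counter()  # revert/undo counts per unordered pair of users
    allies = set()         # pairs connected by at least one revertTo link
    for (src, rel, dst), count in edges.items():
        if src == dst:
            continue  # ignore self-reverts
        pair = frozenset((src, dst))
        if rel == "revertTo":
            allies.add(pair)
        else:
            conflicts[pair] += count

    # Build an undirected neighbor list; True means "same group".
    neighbors = {}
    for pair in allies:
        a, b = pair
        neighbors.setdefault(a, []).append((b, True))
        neighbors.setdefault(b, []).append((a, True))
    for pair, count in conflicts.items():
        if count >= min_conflicts:
            a, b = pair
            neighbors.setdefault(a, []).append((b, False))
            neighbors.setdefault(b, []).append((a, False))

    labels = dict(seeds)
    queue = deque(seeds)
    while queue:
        user = queue.popleft()
        for other, same in neighbors.get(user, []):
            if other in labels:
                continue  # first assignment wins
            labels[other] = labels[user] if same else OPPOSITE[labels[user]]
            queue.append(other)
    return labels
```

Users who never exceed the conflict threshold and have no revertTo link to a labeled user stay unlabeled, which matches the manual workflow: their edits still have to be read individually.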
GROUP - CHALLENGERS
Having tagged VictoriaV as a supporter, we could easily tag Rm99 and Edemir as
challengers based on their wording on the discussion page. Rm99 questioned the
credibility of the external links provided by VictoriaV. Rm99 also claimed that
“The controversy surrounding Catalano has been downplayed considerably.”
Edemir. In addition, based on the conversation in the Civility section of the
discussion page, we suspect that Rm99 sometimes edited the wiki page from the
IP address 81.96.243.x. By analyzing the reverting graph, we confirmed that
Edemir and VictoriaV have different agendas regarding the Paraiso movement. We
also identify DailosTamanca as a challenger because DailosTamanca and RyogaNica
reverted each other several times.
GROUP - NEUTRAL CONTRIBUTORS
If we cannot clearly judge a user’s agenda, we can select the user in the
author list and filter the edits by author name. This allows us to read all the
edits made by that user. Using this approach, we concluded that Agustin and
Sara may be ordinary wiki users who tried to maintain a neutral point of view
for the wiki page.
GROUP - PAGE REMOVERS
We also found some interesting patterns by creating a time series visualization
of the size of the wiki page. The time series shows several sudden decreases in
page size (Figure 3). The time series at the bottom shows the maximum, minimum,
and average page size for each day. By sorting the edits by page size in the
table view of the wiki edit history, we found a group of users who did not
contribute to the content of the wiki page. Instead, they replaced the wiki
page with short sentences. Most of them did not log in with a registered
username. We think those users may fit the profile of the protesters mentioned
in the wiki page.
Figure 3. Time series view of the page size. The time series at the bottom
shows the maximum, minimum, and average page size for each day.
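The page-size analysis above can be approximated outside the visualization. The following is a minimal sketch under stated assumptions: each edit is a hypothetical `(timestamp, author, size_in_bytes)` tuple, and the `drop_ratio` threshold of 0.9 is an arbitrary illustrative choice for what counts as a sudden decrease, not a value we used.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_size_stats(edits):
    """Return {date: (max, min, average)} of page size for each day."""
    by_day = defaultdict(list)
    for timestamp, _author, size in edits:
        by_day[timestamp.date()].append(size)
    return {day: (max(sizes), min(sizes), mean(sizes))
            for day, sizes in sorted(by_day.items())}


def find_blanking_edits(edits, drop_ratio=0.9):
    """Flag edits that shrink the page by at least drop_ratio of its size."""
    ordered = sorted(edits, key=lambda edit: edit[0])
    flagged = []
    for previous, current in zip(ordered, ordered[1:]):
        previous_size, current_size = previous[2], current[2]
        if previous_size > 0 and current_size <= previous_size * (1 - drop_ratio):
            flagged.append((current[0], current[1], previous_size, current_size))
    return flagged
```

Flagged edits are only candidates: a large legitimate cleanup would also trigger the threshold, so the replaced text still needs to be read in the table view.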
GROUP - PEOPLE WHO ONLY REVERT EDITS
Finally, we compared how often each user appears in the reverting graph with
the number of edits made by that user. We found two users (Kurrop and Seina)
who only revised others’ edits. They did not add any information to the wiki
page.
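The comparison above amounts to checking whether every edit by a user is a reverting action. A minimal sketch, assuming each edit is given as a hypothetical `(author, triples)` pair where `triples` is the (possibly empty) list of revert, undo, or revertTo triples extracted from that edit's summary; the `min_edits` cutoff is an illustrative assumption:

```python
from collections import Counter


def revert_only_users(edits, min_edits=2):
    """Return users whose every edit produced at least one reverting triple."""
    total = Counter()      # all edits per author
    reverting = Counter()  # edits per author that revert, undo, or revertTo
    for author, triples in edits:
        total[author] += 1
        if triples:
            reverting[author] += 1
    return sorted(author for author, count in total.items()
                  if count >= min_edits and reverting[author] == count)
```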
Wiki-2:
Is the Paraiso movement involved in violent activities?
Yes
List of wiki edits providing evidence
# (cur) (last) 03:16, 19 September 2006 Alphanzo (Talk |
contribs) m (moved Paraiso to GUNNED DOWN SIX DOCTORS AND NURSES IN COLD BLOOD)
Short Answer:
To identify violent activities, we extracted the words used in the wiki edit
history. Our hypothesis is that violent activities can be recognized through
cue words. After removing stop words and applying a stemming algorithm, we
obtained 1279 unique words. We then applied natural language processing tools
to the comments in the wiki edit history and assigned part-of-speech tags to
the 1279 words. After inspecting those words, we manually assigned a real value
between -1.0 and 1.0 to selected words (the default value for every word is
0.0). A negative value indicates that a word is more likely to be used to
describe violent activities (such as hate and attack). A positive value
indicates that a word is more likely to be used to describe non-violent
activities (such as good and agree). When users and words of interest are
selected from the table views of authors and words, the graph view becomes an
author-word graph, which provides an intuitive way to visualize who uses which
words in the wiki edit history (Figure 4). By analyzing the author-word graph
and the summaries in the wiki edit history, together with the steps used to
identify groups of people, we found the edit that may link the Paraiso movement
to violent activities.
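The word-scoring idea above can be illustrated with a small sketch. Everything concrete here is an assumption: the stop-word list is a tiny placeholder, stemming and part-of-speech tagging are omitted for brevity, and the numeric scores are made-up examples (only the cue words hate, attack, good, and agree come from the text; the values we actually assigned are not reproduced).

```python
import re
from collections import Counter

# Placeholder stop-word list; a real run would use a much fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "by", "is", "on",
              "for", "with"}

# Hypothetical manual scores in [-1.0, 1.0]; unlisted words default to 0.0.
WORD_SCORES = {"hate": -1.0, "attack": -0.8, "good": 0.5, "agree": 0.6}


def tokenize(comment):
    """Lowercase a comment, split it into words, and drop stop words."""
    return [word for word in re.findall(r"[a-z]+", comment.lower())
            if word not in STOP_WORDS]


def author_word_edges(edits):
    """Count (author, word) pairs; these are the author-word graph edges."""
    edges = Counter()
    for author, comment in edits:
        for word in tokenize(comment):
            edges[(author, word)] += 1
    return edges


def author_scores(edges):
    """Sum the word scores per author, weighted by how often each word is used."""
    totals = {}
    for (author, word), count in edges.items():
        totals[author] = totals.get(author, 0.0) + WORD_SCORES.get(word, 0.0) * count
    return totals
```

A strongly negative total only suggests that an author's comments deserve a closer look; as in our analysis, the evidence itself comes from reading the flagged edits.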
# (cur) (last) 03:16, 19 September 2006 Alphanzo (Talk |
contribs) m (moved Paraiso to GUNNED DOWN SIX DOCTORS AND NURSES IN COLD BLOOD)
It is possible that someone with firsthand knowledge of the execution tried to
reveal the Paraiso movement’s criminal activity. From the posts preceding
Alphanzo’s edit, we suspect that the confrontation between Paraiso members and
the Dept. of Health may have led to the violent activity.
# (cur) (last) 03:09, 19 September 2006 Edemir (Talk |
contribs) (97,530 bytes) (→Home Health Care - added confrontation of Paraiso
members and Dept of Health)
We also suspect that this message may be related to the incident that occurred
in the evacuation mini challenge.
Other wiki edits that point to potentially violent activities include
references to Belgium prosecuting Paraiso and to Catalano's death. However,
there is no direct evidence of violence associated with either of these events.
# (cur) (last) 09:26, 4 September 2006 Angelgasperi (Talk |
contribs) (93,439 bytes) (→Controversy and criticism - Belgium prosecuting, wikinews
source)
# (cur) (last) 11:40, 12 November 2006 Danielrengelm (Talk |
contribs) (105,333 bytes) (at least since Catalano is dead, but even b4 that
others contributed as well, even if the I.C. might belittle their impact)
Figure 4. Author-word graph for visualizing who uses which words in the wiki
edit history.